FIGURE 2 

-4242 GCATGCACTG CCACAAGTAG TGAACTCATG GTTTTACCTC CTCAAGTAGA 

-4192 AAACCTTTTG AGTGAATTTG AAGATTTATT CTCCCAAGAA GGACCCATTG 

-4142 GGCTTCCTCC TCTTAGGGGG ATAGAACATC AAATTGACTT TATACCGGGG 

-4092 GCAAGCCTAC CAAATAGGCC TCCTTATAGA ACCAACCCCG AGGAAACAAA 

-4042 GGAGATAGAA TCACAAGTTC AAGACTTGTT GGAGAAGGGT TGGGTTCAAA 

-3992 AGAGCCTAAG CCCTTGTGCT GTACCTGTCT TGTTGGTGCC AAAAAAAGAT

-3942 GGAAAATGGC GTATGTGTTG TGATTGTAGA GCAATCAACA ACATCACCAT 

-3892 CAAGTATAGG CATCCAATCC CAAGGCTTGA CGATATGCTT GATGAATTGC 

-3842 ATGGGTCAAC TCTATTCTCC AAAATTGACC TTAAAAGTGG ATATCACCAA 

-3792 ATTCGAATCA AGGAGGGTGA TGAGTGGAAA ACCGCTTTTA AGACCAAATT 

-3742 TGGATTATAT GAGTGGTTGG TGATGCCCTT TGGTCTTACT AACGCTCCAA 

-3692 GTACATTCAT GAGGCTTATG AATCACACCT TGAGGGATTG TATAGGTAAA

-3642 TATGTAGTAG TTTATTTTGA TGATATCTTA GTATATAGTA AAACCCTAGA 

-3592 AGACCATCTA AGTCACCTTA GGGAAGTTCT TCTAGTTCTT AGGAAAAATA 

-3542 GTCTTTTTGC CAATAGGGAT AAGTGTACCT TTTGTGTAGA TAGCGTAGTC 

-3492 TTTTTAGGCT TTATAGTAAA CCAAAAGGGG GTGCATGTAG ATCCCGAGAA

-3442 AATCAAAGCC ATCCGCGAGT GGCCAACTCC ACAAAATGTA AGTGATGTGA 

-3392 GAAGTTTTCA TGGGTTAGCT AGCTTCTATA GAAGGTTTGT TCCCAATTTT

-3342 TCTAGCCTAG CTTCTCCCTT GAATGAACTT GTAAAAAAAG ATGTTGCATT 

-3292 TTGTTGGAAT GAAAAGCATG AGCAAGCCTT TCAAAGGCTA AAAGCTCACT

-3242 CACCAATGCA CCCATCCTAT CTCTTCCAAA TTTTTCCAAA CTTTTGGAGA

-3192 TAGAGTGTGA TGCATCGGGA GTAGGCATAG TGCGGTTTTG TTGCAAGGTG 

-3142 GACACCCCTT GCTTATTTTA GTGAAAAACT CCATGGTGCC ACCCTCACTA 

-3092 CCCCACCTAT GACAAAGACT CTATGCTCTT GTGCGACCCT AAAGACTTGG

-3042 GGAACACTAC CTTGnGTCCC AAAGAATTTG GnTATCCATA GTGATCACGA

-2992 GTCTTTAAAA TATTTAAAGG GCCAACACAA GCTCAATAAG AGACATGCTA

-2942 AATGGATGGA ATTTCTTGAA CAATTTCCTT ATGTCATCAA ATACAAGAAA 

-2892 GGGAGCACCA ATATAGTGGC CGATGCTCTT TCTAGACGGC ACACTCTCTT 

-2842 TTCAAAACTA GGTGCCCAAA TTCTTGGATT TGACCACATA AGAGAGCTTT

-2792 ATCAAGAAGA TCAAGAACTC TCATCCATCT ATGCCCAATG TCTACATAGA 

-2742 GCACAAGGAG GTTACTATGT GTCCGAGGGA TATCTTTTTA AAGAAGGAAA 

-2692 ACTTTGCATT CCCCAAGGAA CACATAGAAA ACTCCTTGTC AAAGAATCAC 

-2642 ATGAAGGGGG ACTCATGGGC CATTTTGGAG TTGATAAAAC TCTAGACTTT 

-2592 TAAAAGCAAA ATTTTGTTGG CCACACATGA GGAAAGATGT CCACGACATT 

-2542 GTCTAGAGTA TCTCATGTTT AAAAGCAAAG TCTAGAACAA TGCCGCTGGA 

-2492 CTCTACACCC CTTTGCCGAT TGCAAAGCTC CTTGTGAAGA CATTAGCATG 

-2442 GATTTCATTT TAGGACTTCC TAGGACTGCA AGAGGCCATG ACTCTATCTT 

-2392 TGTGGTAGTG GACCGTTTTA GCAAAATGTC TCACTTTATT CCATGCCACA

-2342 AAGTAGATGA TGCTCAAAAT ATTTCTAAAC TCTTCTTTAG AGAAGTGGTG 

-2292 AGACTCCATG GTCTCCCTAG AAGTATAGTG TCCGATAGAG ATCACCTTAA

-2242 ATATATAATT ATACACTTGT TTTTTTTCTC TTTTTTATTT TATCAAGTAA 

-2192 AAAGTATTTG TTCTAGATTA TTATGAGTAT ATACTTACTT TCTGTATTTC 

-2142 ATTTCTTTCT ATTTTTTATG ACGATGAAAT TTCTTATTAT ATCCAGACTT 

-2092 TTCATATATA TTTTTATTTC TTTTCCATCT AGATGCTCTG TACTTTTCTT

-2042 CAGTTGAAAT TTCCACTCTC CAACAAAACA TCATTCAAGT TTTGTATAAC 

-1992 ACTGTGACGT TAACCAGTTA AAATAAGAAA ATCATGTAAT ATAAATTATT 

-1942 TCAGTAGATA TTTTAGAATT ACAAATACGA TAAATAATTA AATTTAAAAA 

-1892 ATTATTAAAC AATGAATTTT TTTGGAAATT AATATAAAAC TTAGACTTGT 

-1842 GGTTTCTTCA TTCAGTCAAA ACCTTTTTCT ATTGTGTGGC GTGTGCGTGA 

-1792 ACATCGAATT TGGGTGCTTT ATGCCGCTTT ATCTTCATCT GCACCTTCAA 

-1742 ATTAATAATT TAATTCCGGA AAATAATAAA CCCACACACT GTTTTATGCA

-1692 TATATTAAGA TAAATAAAAG AGAACTATTT TAAAGAATAT AAAATAATAA 

-1642 ATGTAACAAA TGATGTCACT AAAGAAGAAA AAAATTAACA AGAATTGTAA 

-1592 TATATTTCTT TATGAAATGT TTTGTGCATT ACCGAGAGAG GTCGAACATG

-1542 ATACACGCAA GCATCTAACT AGTTTGGTAA TTCCTTTTCA ACATCGnTAA 

-1492 GCACATCACA CTAAAATTAC TTTAAATAGA TAAATTAGAT TCAATTGGAT

-1442 GACATTAATT TATAATACTC TATCCAAAAT TATAACTATA AATAAAAAGT 

-1392 TATTTTTAGA AAATAAGTAA TGAAAATTTA ATTCTAAAAT TTATAACACT

-1342 TTTATGCTGT GTTTGTTTCG AAGCATAGAA AAATAAAAAG TTATTGTTGG 



-1292 GAATGAAAAG TGAAGAAAAT CATGTAATAA AAACAAAATG ACACGACAAT 

-1242 CAAAAAAAAA GTTTTCATGC AAAACTTTTT TCAAAATTTA CACTTTTATG

-1192 ATGTGTTTGT TTCGAAGTGT AGAAAAACGA AAAGTTATTA TTGGTAATGA 

-1142 AAAGCGAAGA AAATCACGTA ATAAAAACAA AGCAAGATGG CACGACAATC 

-1092 AAAAAAAAGT TTCTACACAA AACTTTATTC AAAATTTACA ACACTTTTAT

-1042 GTTGTTGTTT GTTTCCGAGG TATAGAAAAA CAAAGAATTA GTGTTGGTAA 

-992 TGAAAAGTGA AGAAAACCAT GTAATGAAAA CAAAATGGCA CGACAATCAA

-942 AAAAAGTTTT CACGCAAAAT TTTCTTCAAA ATTTATAACA TTTTCATGTT 

-892 GTGTTTGTTT CAAAGCCTAG AAAAACGAAG AGTTACTATT GGTAATGAAA 

-842 AGCGAAGAAA ACCACATAAT AAAAACAAAA TGGCACGACA ATCAAGAAAA

-792 AGTTTTCACA CAAAACTTTT TTCAAAATTT ACTATGTTTA TTTCGAAATT 

-742 TAGAAAAACG AAGAGTTATT ATTAGTAATG AAAAGCGAAG AAAACTACGT 

-692 AATAAAAAAC AAAATGGCAC GACAATAAAA AAAGTTTTCA CGCAAAATTT 

-642 TCTTGGTGCG CAGAAAGTTA TATATATTAA TTAATTAATT TTCATTTACT 

-592 TTTTTCCCTT TTTATTTTAA AGTTAAATTA TTATTATTTT CATTTAAAAT

-542 ATAAATATTA TTTAAATATA AAAAATATAA CCTTAATCAA AACAAAGCCT 

-492 TAATCTAAAA TTTACAACAC TTTTAACCTT AAAATTAACT TTAAAAGGAA 

-442 AATGATAGTG TGACAACTAA AAAAGTTGTA TACAACCCTG TCATAGGTTT 

-392 AGAAATAAAT ATATATAATA AAGAGTAAAT TTGTAATTAA ATGATATAAA 

-342 AAAGTATTAA AATAATAATA TTTAGAGTAG TAATATGGTT GTATAAAAAA

-292 ATGTGGTTGT CCATATATCA TTATTCACTT TAAAATATCA TGACAAATAT 

-242 TTTCACCGAA AGATGGAAAG AACGAAAAGA GCGTTGGATA ATGGAAAAAT

-192 ACAAGCAATC TCCCTCCAGT ACTTTGCATA ACATTTTGTA TTAGTGATGA

-142 GTTTTTTATC ATATATATTT AGAATATAGG AAAATTTTAG AATCACGTGG

-92 ATAGCTATAT AATAGTAATA TTTTAATTTA TAATGTAGTT GATTTTATTT

-42 GTCAACTGGT ATACATAAAT ATGTGTTGAT AGTGGGTGAC TTGTGGCTTA

9 AAGAAATGTC CAGAGGCTGA CAACAACTCT GCACAGACTA GCGTAAAC

57 ATG AAG TCC AAT TTT GCT ATT TTC GTA GTC TTT TCT CTT CTT CTT

1 M K S N F A I F V V F S L L L

102 CTG GTACCTCTTCAATCTTCTCTACAAAAACTCTGTTGCTCTTTCACCTCTGTTTGTA

16 L 

160 ATTTTGTTTACACTTTTGGAAAATTGAAGCTGATATATATGTAACAACCTTTCAGTTTT 

219 GTCTGCACTGAAACTGATAGAAAAATATACGTTTTGTGGATATATATAG GTT GGC 
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493 AGACTATGTGATTGGCAGTTTCAGACTTATTTGGCACCAAATTTATGATGCTCTTGTTGCTG 

555 TTTCAAAATTTGTACTCAAACTTTGAACCCTTTGCAGCATCTTGCTTCTTTTTGGTCTTGCT 

617 GAATTTTGTCACAGTTATACTGTCACGAATAGTTTCTCTTCATAATAAGCAACTTTTCCTCT 
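
As an illustrative aside, the one-letter translation printed under the coding triplets above (positions 57 to 101) can be re-derived from the standard genetic code. The following is a minimal sketch, not part of the original figure; the function name `translate` and the variable `exon1` are hypothetical, and the codon table is the standard nuclear code written in the conventional TCAG order.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, TTG, TCT, ...).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate a DNA string (spaces allowed) into one-letter amino acids.

    Unknown codons (for example those containing 'n') are reported as 'X';
    a trailing partial codon is ignored.
    """
    seq = "".join(dna.split()).upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


# Coding triplets copied from the figure (positions 57 to 101).
exon1 = "ATG AAG TCC AAT TTT GCT ATT TTC GTA GTC TTT TCT CTT CTT CTT"
print(translate(exon1))
```

The same helper can be applied to the other triplet-formatted lines in the figures to cross-check their printed translations.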
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FIGURE 3 



100101 CAAAACAAAAGCAAATGCCGGTTTTCTTATTATTATTTCGAACTTTAGAC
100151 CTTTTTGTAACGTTTCTTTAATTTTTTTCCTTGATAAAGAACCCTATTAT
100201 ATCTTAGCTAAATATTTACCTCATTTTGTTTATGAGCTAAACCACCCCAA 
100251 AAATATTGTAGTTTTGCTTTCGGATTTAACTGCCAAGCAAGTGATTAGAT 
100301 ATATTAAAGGAAAATGAATGAAAGGACAAAAAAATATAAACGACAATATT 
100351 TGAATACTGATATTTATCTCCATTCTCAAATATTTTTGATTTATTGTGAC 
100401 AATATTTGGTTGTTTCCCATTTGCTACATCTTTGAGGACATGAAATGATA
100451 ACATATATATGAACGAGTATAATACATTCTCGTTTCATTTTACAAATAAT 
100501 GTCAATTTATGCTAACATTTTTTATTTAAAAATTATCCTTATAAGATTTC 
100551 AGTGTATTATTTTACCATGGTACTGTAAAGTCGGATGCTATATATATATA 
100601 TATATATATATATATCAAAAATGACACTGAAGAATTTATTTGAACTAAAA 
100651 CTAAAAACGTAAAATAAAAAGAATTTTTCAAAAATCAAAAATTTTATATA
100701 AAAATATAGATAAAATGTTAATATAGTACAACTTCTATTCAAACAGAGAG 
100751 AATAAATCTTCTATAGACAGTGAATATCCATTATAATAACGAGCAATAGT 
100801 TGTAATGTTGCAGTACAAAAAGAGAATTGTAATATTTGTGCATGATTGAG 
100851 AAATCTAAGTTGACTTTGAATTAAAAGGCTAATTCCAACAAGTACATGTA 
100901 GAAGTTGACTATAGCTATATATTTACTACAAATTGATCATTTCAAGAAAG
100951 ACATTTAAATTAAGATATGCATGCATGACTTGATTGAACCCCACTCGCTT
101001 GCTTCGTGCCATTCGACAAGATGTTACTTTTAAATGCAAGGTAAATTATG
101051 GATATACTCTTCTGTATTTTTTGTAGTAGATATTTTTACGAAAATTGTTT
101101 TTTTTCCAAAATCAAATGATATTTATTAATTTTCAATATAGAATTAATTA
101151 AATTTTAATTAATTTTGAAGATTTATATGCTGCAGATTAGATTACCATTG
101201 GTGAAATCATGTTTAGGTAAATAATAAATGATGTTGTAGTTTAGGAAAAA
101251 AAAAAATTCTTTAATCTTTATGTAAGAATGTTAAACTTCAATTATAAAAA
101301 TATGAAGCAGTATTATATAAGATGTTTAACTAATCGAATAATATTTTTTG
101351 GGATGAAATTTTCTTGCATATGTTTCTAAAAAAATAATATGTGAAAAATT
101401 AACATTCATTGTATGTTTATAAGAAATATATGTGAGTTTTGTTTAGATAA
101451 ATAATACTTAAAATTAAGAATTTGTAAAGTTATACTGCACTTCAAATATG
101501 TTATTTTTTCCTTTTATTTAAAATATCAGCAACATTCTAAATGATTTTAT
101551 TTTCTTTAAAAAATTGAAAAAATGAAATTAGCAAATATGTAAAATTTAAA
101601 ACGAATTTAAGAAAAAACTTTGTAAAGATATGATATGCTTTATAAAAAAA
101651 ACTTGGTGGCGTACCTACTAAATATGATCACATTAGAGATTTGTATCCTT
101701 TAGCATATAGTATGTAGTATAGATATCTATATTTTTATTTATTAAAGAGC
101751 ATATTCATAATATAGGTATTATATGTTAATTACAATAAACGTTCAATTCG
101801 TTATGTTAGTTTTTAGAAAACTTATTGCGTGTGCATATCAATGTGAGAAA 
101851 GCGACTCCACATGTGAGATGTTGGTCTGAGAAAGCTTTCTGCACTTGGTC 
101901 GGAACTACTTCATGGACTAGAATGCAATCCATCTATTCAAAGAAAAGCAG
101951 TTGTCCATGCATGCCTCGGTTTTTCACATTTGGAAGCAGCGCAACAATGT 
102001 CTTACATAATATGCGATCGATCACTCTGCAACCAATATTCAAGTACATAG 
102051 ACCATGACATCAAAAACATTATCACACCGAGAAGAAAGAAACGTCAATTT 
102101 GGTAACTTAATGGCGTTATGCCTGCGGTGAATTCTCCTAAGAGTTCTCCC 
102151 AAATTTTATTGATTCCTTGTTTTTAACTTTTTCGCCAAAGAATCATACAT
102201 ATAGATTTGACACCATTTCAACTTATCAAATACAAGTGAATAAATAATTT 
102251 CAAGCTTGAAAGGAATTTAATCATGATCTAAACCTAAACGACAAATTCTT 
102301 CACAAGTGAGAATCACTAATTGACTACCCCTTGGTCGCATATACATCATT 
102351 GTTGTAAATCTGAAAATTGGTTTGGATTTGATCTGATATGTCATTCATAT
102401 AAAACTTGTATTATTTATTTTAGAATTTTGCCGCAAACAGATAAATCATC
102451 ATCTATTTAGAAAATTTTCATTTGCACCACAATTAATCAGGGGAAAAGGT 
102501 GAAATCACATATCTTATCTACACTCTTTATTAATTAAACGCCATAATATA 
102551 ACAAATTTTCAAATACCACTTATGAGAAGCACTAAGATCACCTTTTTCTT 

102601 TATGACTTTCTTTCTAAAGCTAAGCTGGTAGTCATGAGTGATGATTATGC
102651 TTTTCCTAATGGGAATATTGTGGAAGCGGTTTCAAATCTTTAGACAAAAT
102701 TCCATGGCCACTAAAAGTTAGCAAAGTTAAAATAAGTTTAAAAAAATATG 
102751 AGTGTACTTGGCCATATGCCATATTGTTGAGATCATAACAAGAGAAATAA 
102801 TAGTTTATTGAAGTTTAGATCATAATCACAATACATCATTGCCTTCATCA 
102851 ACATTTTCCATGGATTTGAGAGGATCAACTTCAATACTAATGGTGGGGTC 



102901 TTATTCATCCATTGCTCTCTAGCCAATTAAGCAGTTAGGTTATTTGTGTA
102951 CTCTAGTAGTTGCCAAATCAATCTTAATATTCACAATGTTGTAATTTCTA
103001 ATTACGTATAGATAAATGACTAGATAACACGTGGCTTTGGTTTTATCAGG 
103051 AAAGTTTTCCAAATCATATATATGAATGTAGAATAGTGTTCTTCATTAAT 
103101 TATTAATTAGCATCTCACCATCTGAGACTGGGAGCATGTGACAAGTTGAC 
103151 ATGTGTATTAAGAGAACTTTGAGAAAACCACTTTTATGATACTCCCATCT 
103201 GAGACTGGGATGAGTACCATTTTATAAAAATATGAGTAGTGAAAAAATAT 
103251 TCAAAAAAAATTCTAACATGTCCTTTAAAACATTTTAACCTTATAATTTT 
103301 AACAAACATCTTCCAATATGCGTTATGAAAACTTTATAAAACTTTTTTAT 
103351 AACATGCTTTTGAAAATTTTATAAATCTGTATTTTTAGAAACAAAGTGAT 
103401 ACTTTTGAAAATAGACAAATGAAGTGCTATTTTTTAAAATTGATATCATA 
103451 AGTCTTAACTGTGGTTTGTTTGAATTTTATTTATATACTTGTCAAAATAA 
103501 AACTAAATAAATAAATTAAATTATTTTATAATCATGAAGATAATATTATC 
103551 ATAAAAGATAAATATAAAATCAACAAATTTATATTTGTTAATAAAAATAC 
103601 TTTGAGCTCTTCTTCATAAGACTTTTCCAGCTTCCATCTAGAAAATCACA 
103651 TAAATTAAAAGATAAATAACCGAATAAACATAGTTCACATTCTAACTCTT 
103701 AGTCTTAGATTTGTTTTAATTTTCAAAGGTTTAGGTATTGTATATGTTTT 
103751 TTTTATTGGGTTGCTAGATTTTGATCCAAGAAGAAATGACGGGTTGTAGT 

103801 ATAGATGGTTTGTTTGAGTTTTTTCCCCTTGGTTTACTTCGTTTGGTTTT
103851 TGTCCCCAGAATTGTTCTTGTACTCGCTGGTTTATGTCTCTACAAAGTCC 
103901 ACGACCATTGCCGGCTCTTTGTATTTCAACTTGAATTCTAAATTCGATTG 
103951 ATGAAAAAAAAATGTATCTCTTAAAGTCCATTAGTACCAAAAATAACTAT 
104001 ATCATTACTACATAAAATAGTCTTGGGTTTTCCAAAGTATTTCGTTGATA 
104051 TATGTTAAGAGTTCGAAATAGACACATAGATATAATGTTGAAATGGGACC 
104101 TCTCACATAATTATCTCCTTTTCTCTTCATTTCTCTACCTCTCAAGTTTC 
104151 CAATCCCACCCTAAGGTAATTTATTTCTTAACCTAAGTAAATTTGTTAAC 

104201 AAATCTTAACTAGCTACAAATGTGTATTACAAGTCTTAAATAAAAACCTA
104251 CTTTAATTCAAAGGTATTAAACCTTCCTAAATTGATACTTACTTAGTATC 

104301 GATCGGTCTAGTTTAGGGTTTGGACAACACACCATCATGGGGACGAAATT
104351 AGTCATTCTACGGTGTCCAAGACACAAATCTCGGACTCGATGTGGATATG 
104401 ACACTTCATTATAACTTTTAACTTCATAAAAACTAACTATTAGGAGGAAG 
104451 AATCGGAATCTGCATATCAATCACAATAGACTATAGTATACTTAGATTTT 
104501 GATCTAATCAATGGGCTCCTTCAACTAATAAGTAGCCCACTACCAATAAT 
104551 GAAATCATAAGACATTATTAAATTAATCAATGTTCTAAAAATACTTTGGT 
104601 TATGTGTCCCGTAGAGCTAATGTGCACACACAATGAAAGTTGACCCGTTT 
104651 CACTTGTCCCACTTTTATGATCTTTTCTTTTAGGTTAAATCCAACTTTTA 
104701 TAATCTCATCTTGTTATCAAACAAAACTTTTGGCCTGTCTTTTTCATAAT 
104751 TTAAAGTAACTCTCACGGAGAAAAGCCAACATTTTCTTCTTGTTTTATTC 
104801 TTTTTAAGAAAAATGAATTCAAGGGGACCCCAAATTTAAAAGGAAAACCA 
104851 AAACTCCTTTCTATGTATTTATTACTTGAAGTTTTCTATGTAATCAACAA 
104901 TCCTAACAGTAGAGAATAAAAAACATCGTTTTGGGAGGTTTTATATTAGC 
104951 ATATGAGAATAGTTCTAAAATTGTTTTACACAAAAATTAGATTTTCTTTT
105001 CCTCTGTCAATGGAGCTATATCACTTGTCATTTTGCTTAACCCTTTGCGG 
105051 GAAGATTGTTATGAAACAGTTTTAATGGAATTCTAGTTGCCAATGTCACG 
105101 TTTAATATGTTTTGTCCCTATACTTTATTGAATCTTATAATCTTTGTTAT 
105151 AGAATTATCTACTTTTAGTATTTTACATTAACATAATCTATAGAATTCTT 
105201 CTTTGTTCTATACAATTAAACAAGTAATATATTCTTAATACATATTAAAA 
105251 ATGGTGGTGTTGCTATCTGAGCTGTAATAGTTGATTGCTCCAGAGAAGAA 
105301 TAGACAAAAATCCTTACTTAAGAGGCCCACCACTCTGAAAATTTAGACAA 
105351 GAAAAATTAAACAAAATTAGGTTACACATATTATCATTTATATATATGCA 
105401 CAACACAAAGTTGACCTTGCAATGTACTATTGAATAAAATAAATAAATGC 
105451 AAGAAGAGAGGGAATTATCACTGTTACCAAGAAAACAACTTCCTCTAAAC 
105501 AGGTCTCTATATATATAAACTTTAACACCTAAAGAATTAACACAGATCAA 
105551 GAAAAAATCCTCAAAACAAAAGTTAAAGCAGAC ATG AAG CAA CAG CAA 

1 M K Q Q Q

105599 CGT TAC TTG GTC GTC TTC ATC GTC CTT TTA AGC TTT CTT

6 R Y L V V F I V L L S F L
105638 CTG GTAAAGCTTCTTCCTTAATTATATTAAAACCCTAATTAAGATCTCATATA

19 L 

105691 TCTGAATGTTGTATATATTTGTTGGTATAG TTT GTG AAT CTG AGT 
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GTGAGTTTTGTGTGATGAAAATTAGATTTGCTTCACATTTTGTTTTGATA 
TATATAAATCAATATACTGTGCCTTTCGTGTCTTGTTTCTTATATTATTT 
TGTGACATTAATTAATTATCTTATCAAAAATTTATTTTATTAACTGTGTC 
CTATGGAAAAAGATGAACAATATGAGTTAACCTCATCTCAAGGAGATTCT 
TTTTTGTTTTGTTTTTC 




FIGURE 4 

1 AAGCTTTACAAATGTCCCCCAAAGATGAAACCACGTTATTATTAGTAAATCCTGAAAAGG 

61 TTAACGCTTCTGTTCCTCGAATTCTAAACCATCTGAAATATCTAGTGGTTTAAAATGGAG
121 ACTTGAGGATATAGTCTCCTGAACCAGCTGTCACGGCTGAGTTAGATAACATTACTGAAT
181 TTCTACGGGAGCGGTTGAAATCACTTTCGCCCCTTTAAGAAGAAGCCTACACCGGGCACC 
241 TTCTTTACGCAATTCGAAATTTAGTCTTGCCAGGCAGTCGTTGGATCGAAGGTCTTTTTC 
301 GATACCGAGGAATCTGACTTTGCAAGGAATAATTCCTAATCACACCACCCCAACCCCTGA 
361 ATACACTTCAGGACCCTCTGAAACCAACTTCGTTTCGGCTAAATCACAAGAATCTCCCAC 
421 TCATTCCGATTTTAGCCAATTAAATATGATATCGGTCTGGGAAGCCGATAAGGAAATTCT
481 ACAAAAAGAGTTTATGAATGAGGAAAATAAGGAAAAGAGAGAACTATTTTTTAGGTACCC
541 TGAAAGAGAACGAGAAAAATTTAGAAAAAAATACTACTCTCATCTGTACACTGTTCAAAA 
601 GAATATCCnnnnnAATGGTTAGATAATATAAGAAAAGGATAAGTATGATTAAACTGAAAC 
661 CACGTCGGCAGAAACAAAGTGAATTCCCCCCTTTAGAGGAAGTTCGTTTCTTAAATATAG 
721 AAAACAAAGAAGTAGTCGCCTCCCCTTTTAAAATGATCTCAGAAAAACGAGAAGTAAGTA
781 TAAAAGATATTCAAAATCTACACAGTCAACTAAATTTTACTAATCAAATGCTTTTTCAAT 
841 TAGCAAATAAAAAACAAAAGAAAAAAGmGAAAATTGAAGAAAAATCGTTAATAAAACCAT 
901 TTAAATTCTCAGAAGAAGAGATAAAACAGTTAAAAATTGGTCAAACTTTGGATTCTTTAT 
961 ACGATGAAGTAAAACAAAAGTTATCTATCTCGGTAATAAAAGAAAAACCGAAATCTAATA 
1021 ATGATATGCCCAAAAGGACAAATCCAAATCAAGAAGTTTTAGACGAAATCGAAAAGAGAT
1081 TAAAACAAACTCTGAACGACACAATAAATGTGATAGAAGAAACTAAAAACTCAGACTCAT
1141 GTTCAGAGTCTCCCGATCGTATTGAAAAAATAAAACGTAATAAATCAGAGATTTCCAGTA
1201 AGCCGAAATTTTTACACTCGCCCCACCTTCGATATCATCGAGATGGCGATGGACACCTCA
1261 GCATTGATGGAATGGATACTGAGTGATATGATGGATGACAGATGATGAATATAGAAAAAC
1321 TCACGAAATAACAATGGCCGCTACAGCATATAGAGTAAAACATACCGAGGAACAAACAAT
1381 AAAATTAATTATATCTGGATTCACGGGAGTATTAAAAGGCTGGTGGGATAATTACCTCAT
1441 GCCAGAACAAAAGAATTATGTTCTAAGCTGTGTAAAAATAGAAAACGAAGAAGGAATACC
1501 ACTAATGGTGGAAACATTGGTGGTAGCAATAATTCATAACTTTATAGGAGATCCAAAGAT
1561 TTTTGAAGAAAGAACATCTTTATTACTTCATAATCTAAGATGTCCAACCTTAGGTGACTT
1621 TAGATGGTATTCAGAAAATTTTTTAGCTATGGTTTTAACAAGGGAAGATTGTAGAGAACC
1681 TTTCTGGAAAGAACGGTTTATAGCTGGATTACCGGATATCTTTGCTGAAAAGGTAAAAGA
1741 AAATTTACAAAAGGAATGCCCAAACACCCAATTAAAAGATGTACCATACGGGAAAATAAG 
1801 TTCAGTTGTAAAAAATACAGGTCTTCAGTTATGCAATAATATGAAAATAGAAAATAAGAT
1861 AAAAAAGAGTGAGAGTCAGGGCATCAAGGAATTAGGGGAATTTTGTACTCAATACGGTTA
1921 TGAACGAAATACCCCTCCATCAAAAAATAAAAAGAAAATAGCAAAAAGAAGAACAgGGAG
1981 AAACAAGCGCTAAAACAAGCGCTAAACCAGCACGTAAAAATTTTAGAAAAACGGTTAATT
2041 TTAGAAAACCATGAAAGTCTAATGATAAGCCCACTATAGTCTGTTATAAATGTGGACGCA
2101 TAGGACACATGAAGCGAGACTGTAGACTAAAAGAAAAAATTAGTAATTTGACCATAAGTG 
2161 ATGAATTAAAAGAACAAATGGAAAAACTTCTGATAAATTCCTCCAGAAGAGGAAGAAACA 
2221 GAAGAATCAATAGGAGATTCTGATTACGAAGTATTGGACATGAGGATAACAATTGTAATT
2281 GTGTCTATAAAATAAATACGATAAGTAGTGAATTAAAATTTGCGTTAGATTGCATTGATA 

2341 AAATTAATAATCCGGAGGAAAAGACCAAAGCCTTAATAGACATGAAAAGGCTACTCGTTG
2401 AAAAAGATGAACCCAGTTCATCTTCACAAAAACCTGAATTTATAGGATATGATTTTAAAG
2461 AAATATTGAGAAAAGCGAAAACATCACATAAAGAAATAACCATTAGCGATCTTAATAGTG 

2521 AAATAAATAAATTAAAAGCCGAAATCGAATCTATAAAAGTCGAGCTACAAGAATTAAAAG
2581 ATAAAATTATACATGAGGAATCCATCTCCTCTGCCGACGAAAATTCACAAGAAGAGGAAG 
2641 CTAGTAGACCTTCCATCAAAGAAATAACATACAAAAGACAAAAGTGGCATGTAAAAATAG
2701 CCCTAGAATTTGTTTGTTTTGTGACCGTTTCATTGTGGTCAAAGATGAGTCCTTACCTAA
2761 CACAATAAAAAACGTTACTCTTAAATATCAAAGGAGAGCTACAAATATCAATGAATGAAT 

2821 GACATTAATATTTTTCTTTAGTTTTAAAACTTGAATGAGTTGTTTTCATAAATATCTGAC
2881 TGACTGACATTTTTATTTTTTCTGAAAATGAGGAAGGTTTATTACGTTAACACCATATAT
2941 ATATTTTTATCTCAAAGTCAACGAAATATTATAAAAGAATCAATTAAAAAAAATTATTCT
3001 TTTGCAGAAAAAAAAATTAAAAATATGAAACTCCTCCACACCATATTACCATATTATAAA
3061 TATAAAAAAACCTCTCACAAATGTGCATTCTGGAATTCTTTATGTTGAGAGATTAATCTC

3121 TAAAGAAAAAAGGTTGAGAAAGGTGCAGCAACA ATG TCT CCA TTC TGT AGA
1 M S P F C R 

3172 AAC TTT TCA ATG GCA TGG GTG CTT ATG GCA TTT GTG TTG TTT 
7 N F S M A W V L M A F V L F

3214 GCA AAC AGT GCT ATG CCC ACA AAT GGA TCC ACT GTT GGG GTA
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3550 TAG TTATATAGATTATTCATGTTTCATCTCAATAAAAAAATGACTTTAGAGTGATTCTT 
3609 AGTTTGCTTAACATTCTTACATATTCCTAACTATTCCGTCACTACCACCCGTAACTATAT 
3669 TTATTTAAAATTAGTATCTGTCACAGTTTTATTTTTAAAAAAGGTTATGTGGATTAGAAG 
3729 AGAGATAAATATGTAGACGGTCACCAACCTTAATTTTTGAACTATGTAAGACTATATTGA
3789 CCAAGAATATATGTTTAAACTCATTCATTTAAAGACTATATCTCCATTTATGATTATGCA 
3849 AATGCAATTAGTTTTTTTTTTCATTGAAGAATTCAAAAGAAAGTTATCATTAAAAAGTAT
3909 CATTAAATCACTTATATGTTGTTTCTTAATATCCTTATTGTTAATAGAATAATTTTTTTT
3969 ATCCTTTAATTAAGGTTATTACTACTTTTTTTTCATATCTTCATTATTTTGAAATATTTT 
4029 TAAAATTTATCAATTTTTGTAACACCCCAGAAAATACATGTAACTATCACTTTTTTTTTA 
4089 TATTACAAATTTATGACTTATAGAAATACAAATATTAAAAATATAAGGTTCAAAACTACA 
4149 TCCTAAAGTCTTTCAGACCCTCTGACACATGTATCATCTGCTCGTATATGTGATACAGTC
4209 ATCGCAGTTCACAAGATAACAAGAAAACCAAGGGTAAGCTAATGAAAAAAAATTCCATAA 
4269 CATATTTAATTCATGCAAAAAGAACCAGTCAAAGTAATCATTTATAAACATTTCTTTAAA 
4329 TATTGTTATATAAAATTTCAATATCAATTTCATCATTCATATAGACCACACATGGATCTA
4389 TTTTCAATCACAATCATTGGATTTCATTTTAATCCTACTTCGnCTTCCAGAAGACTCATT 
4449 AAGTATGCCCCTACCAGAGACTAACACCTAATCAAAGAGAAATGATCAAGGTAAGTTCAA 
4509 ACATCCAATAACGAGTGCCTACAGTGGGACCCAATGTGTATGAACTCCTTATCAGCTTCT 
4569 CACCACCTGATATCTTATTCTATATGACGTAGATCATCAGTGAAACTAGAGGATCTCCGT 
4629 TAAACATATGTTTTTTATACTTAATGTCATCAAACAACAACTCACACATTATCCCAAATG 
4689 TATGACATCAATTTCATACAATTTTCATCATTCATATATAATACATATCATTGAATCACA 
4749 TAACATTTAAAAATTCATACCATTCAAGAACTTTTCCAACATCAAAAGCAATATTTACTT
4809 TCAAACTATCAAAATATAATTATTATTTAATAAAGCT t 
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TTATCTTATTTCCATATAATTGTTGTTTTACTTTCAAAATTTTTAATTTT 
TTATATTTATCTTTTTACAGTTTAAAATTAATAAAATGAAACTTTTTTTC 
TTAAATGTGTTAAAATATAAAATCAAAAAAGTTGTTATATGGTACATGGC 
ACAATCTTATAAATTATTAATTTGAAAACGATACTTTATATAATAAAATT 
ATCTTAGTTGACATTTTTATTAGTGTTTTCAATCATATTTTTGTTTGCTT 
GATAAGCGTAAAACAAATCAAACTTAACGATACTTTATATAATAAAATTA 
TCTTAGTTGACATTTTTATTAGTGTCTTCAATCATATCTTTGTTTGCTTG 
ATAAGCGTAAAACAAATCAAGTAAAGTTGGGCACCTCAATTGTTTTAAAA 
AAGTTTGGGTACCTCAAAAATTAATAGGTCTTGTCAGATTCTTACAAAAA 
AAATCTGGAAGAATTTATGAAAGAAGGGGGGGGAGGGGGGGAGGGGGGGG 
AAGTGAAGATGAATATTCAACAAAAGAGGGTAGGCATGATGTTAAGTGAG 
TTAAAAAACTATGTTAATGGAGACAATTTTCTGTTAACAAACCCGTTAAT 
TGAAAACGATAGCATTCTTCTCTAACAATGTAAAACGATATTGTTTTATC 
ATAACTACTCATTAAATTTCTGAGTTTCAAATCATATAAAGATTTAGGGG 
GGTGTATTCAATTAAGGATTTGAAATGATTTGTATTAAAATGACAAATCC 
CATGTTATTTCAAACATGAATTGTAAAAACTTTTTTAAAATCAAGTGTTA 
TTAGATTAGTGATTTTAAAATGTACAACCAAACCCACTGTTATTGGAAAC 
ATTTTAAGTAGTGGATTTAAAATGACTTGAGTGATTTTGGGTGGGATTGC 
AGAAAATTTCTTAGTTAAGAATTCAAACATCCAAATCTCATGGTTTCAAG 
TAGAATTTGGGAGAATTTTAATAACAAATCTCCTAATTTACCAAAAGTCA 
CCAAAATCATTTAAAAACTCATTAAAATTTAAATGATTTCAAATCTCCAG 
TTGAATACATCCCCTTGGAATTAGAGATTTTGCTCGATTTGGGACCTAAG 
ATTGAATTTTGGGGATTTAGTTTAATCGTTACAACAAAATGACATCGTAT 
TATTGTTATAGGAAACAATGTCGTTTTCAGTTGACATGTATGTTAATAGA 
AAATTAACTCTATTAACGGGATTTGCTAACCCATTTAACATCGTAACTAA 
ATGGTCAAGTCAATAAAAGTTTGGTATTTATTTGAAAAGTCAACGTAAGT 
TTGATATTTATTTGAAAAGTCAACATAAATTTGATATCTTATTTCGTTTC 
GACAGACATAAGGATTTACATCAATGTTTTTAATAAATTAAAGATTATTA 
TGACATTTTTTCCATTTAAAATTGCCAATGTTTTCGAAACCAAGATACTC 
AAAATTGACATACCTAATTCAATCTACATTTGTTTGACAGCAATTCACGT 
GCCTTGACCACATGGCACATACTGGCAATACATCAATTTTAAGGAAAAGG 
TAGATTCGGATACAATATAATGGAAATAAGTGGAAAGGATCATTGACTAC 
TTGACTTGTAACAAACAACACACAGTATATAACTCATTCGACATTTACAA 
ACAACATTGTGCTAGCTTAAACTCCCTCTCCTATTCAAAAAA ATG 

M 

GAT ATT CCA AAG CAA TAT CTA TCA CTA TTC ATA TTG 
DIPKQYLSLFIL 

ATT ATC TTC ATA ACT ACA AAA TTA TCA CAA GCC GAC 
IIFITTKLSQAD

CAT AAA AAC GAC ATT CCA GTT CCC AAC GAT CCA TCA 
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GTG GAA ATC AAT AAT GAT CTC GGT AAT CAG CTA ACG 
VEINNDLGNQLT 

TTA CTG TAT CAT TGT AAA TCA AAA GAC GAT GAT TTA 
LLYHCKSKDDDL 

GGT AAC CGG ACT CTG CAA CCA GGT GAG TCG TGG TCT 
GNRTLQPGESWS

TTT AGT TTC GGG CGT CAA TTC TTT GGA AGG ACG TTG 



TAT TTT TGT AGT TTT AGT TGG CCA AAT GAA TCG 
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TCG TTC GAT ATA TAT AAA GAC CAT CGA GAT AGC GGC 
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GGT GAT AAC AAG TGC GAG AGC GAC AGG TGT GTG 
GDNKCESDRCV 

AAG ATA AGA AGA AAC GGA CCT TGT AGG TTT AAC GAT 
KIRRNGPCRFND 



139873 GAA ACG AAG CAG TTT GAT CTT TGT TAT CCT TGG AAT 

146 ETKQFDLCYPWN 

139837 AAA TCT TTG TAT TGA CAACAATATGCTGATGTTCTGTCTTTTAC 

158 K S L Y • 

139793 GACTCATGGAGTTTCATTGTTTGAAACAATAATATAAAACATATAAAATT

139743 TCTATTATTCCAAGTTCCAACTTATAATAATTTGATAATCATATCATATT 

139693 ATCATCTTAAGCATTCAATGCTACAAAGATAATACCCCCAAGCTATTTTA

139643 CATTAAAAGCTGAAACAGAGACACAATACTAACGATAAAAGTTCGTAGTA 

139593 TCTTTATGCAACCATACATACATATACACAAAGATAGACAGGTAGTGTCC 

139543 TAATAATTCTACTTGGGTGAGGTATGAACAGCAGCAACAGTAGATACCAT

139493 TGTATCCATACCACACATATTATGAGGCCCTCTGCAGATTTTGTAGTAAC

139443 CATGCTCTCCCCACATCGCTCCCCACGAGTTCTTGATAATCCAA
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G564 promoter: 
Gain of function constructs 
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Web signal scan Program 

Database searched: PLACE 

url: http://www.dna.affrc.go.jp/htdocs/PLACE/

This is the sequence you submitted 

>G564 promoter (-921 to -662), 450 bases, 3D1A0BF4 checksum. 

TGAAAAGTGAAGAAAACCATGTAATGAAAACAAAATGGCACGACAATCAA 

AAAAAGTTTTCACGCAAAATTTTCTTCAAAATTTATAACATTTTCATGTT 

GTGTTTGTTTCAAAGCCTAGAAAAACGAAGAGTTACTATTGGTAATGAAA 

AGCGAAGAAAACCACATAATAAAAACAAAATGGCACGACAATCAAGAAAA

AGTTTTCACACAAAACTTTTTTCAAAATTTACTATGTTTATTTCGAAATT 

TAGAAAAACGAAGAGTTATTATTAGTAATGAAAAGCGAAGAAAACTACGT 

AATAAAAAACAAAATGGCACGACAATAAAAAAAGTTTTCACGCAAAATTT

TCTTGGTGCGCAGAAAGTTATATATATTAATTAATTAATTTTCATTTACT 

TTTTTCCCTTTTTATTTTAAAGTTAAATTATTATTATTTTCATTTAAAAT 

Notation: H = A, C, or T 
R = A or G 
K = G or T 
W = A or T 
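
To make the notation concrete, the degenerate letters listed above can be expanded into character classes and used to locate a signal such as TGHAAARK in a sequence. The following is a minimal sketch under stated assumptions: the helper name `find_sites` is hypothetical, only the four ambiguity codes given in the notation (H, R, K, W) plus the plain bases are handled, and only the given strand is searched. It is not the SignalScan implementation.

```python
import re

# Ambiguity codes from the notation above, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "H": "[ACT]",  # H = A, C, or T
    "R": "[AG]",   # R = A or G
    "K": "[GT]",   # K = G or T
    "W": "[AT]",   # W = A or T
}


def find_sites(motif: str, sequence: str) -> list[int]:
    """Return 1-based start positions of every (possibly overlapping) match.

    Raises KeyError for any motif letter that is not in IUPAC above.
    """
    pattern = "".join(IUPAC[ch] for ch in motif.upper())
    seq = "".join(sequence.split()).upper()
    # A zero-width lookahead lets overlapping occurrences be reported too.
    return [m.start() + 1 for m in re.finditer(f"(?={pattern})", seq)]


# First bases of the submitted promoter fragment.
fragment = "TGAAAAGTGAAGAAAACCATG"
print(find_sites("TGHAAARK", fragment))
```

Positions are reported 1-based to match the "Loc." convention of the scan output; a full scan would also search the reverse-complement strand.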



RESULTS OF YOUR SIGNAL SCAN SEARCH REQUEST 

/tmp/signalseqdone.9437: 450 base pairs
Signal Database File: 



Factor or Site Name     Loc. (Str.)  Signal Sequence  SITE #


-300ELEMENT 


site 
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( + 


TGHAAARK 


S000122 


2SSEEDPROTBANAP 


site 
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CAAACAC 
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ACGTABOX 


site


296 


{ + 


TACGTA 
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ACGTABOX 


site 
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TACGTA 
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TGTGGWWW 


S000169 
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( + 
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■■= -S000028 


CAATBOX1 site           189  (+)  CAAT             S000028
CAATBOX1 site           323  (+)  CAAT             S000028
CAATBOX1 site           138       CAAT             S000028
CANBNNAPA site          101       CNAACAC          S000148
CCAATBOX1 site          138       CCAAT            S000030
CEREGLUBOX2PSLE site    55        TGAAAACT         S000033
CEREGLUBOX2PSLE site    201       TGAAAACT         S000033
CEREGLUBOX2PSLE site    333       TGAAAACT         S000033
DOFCOREZM site          4    (+)  AAAG             S000265
DOFCOREZM site          53   (+)  AAAG             S000265
DOFCOREZM site          112  (+)  AAAG             S000265
DOFCOREZM site          149  (+)  AAAG
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( - ) 
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t - ) 
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f — ) 
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site 
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( — ) 
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site 
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site 
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site 
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( + ) 


AATAAA 


S000080 
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sice 


324 


( + ) 


AATAAA 


S000080 


POLASIG1 


site 


237 




AATAAA 
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s i te 


411 




AATAAA 
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si te 


268 




AATAAT 


S000088 




site 
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AATAAT 
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site 
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site 
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S000083 


pn T ,T M FMl T*E L AT 5 2 


site 
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{ + ) 
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site 
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site 
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s i te 
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s i te 
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s i te 
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POLLENl LELAT5 2 


site 


349 




AGAAA 


S000245 


PYRIMIDINEBOXHV site    400  (+)  TTTTTTCC         S000298


RAV1AAT 


site 


97 
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S000314 


ROOTMOTIFTAPOX1


site 
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( + ) 


ATATT 


S000098 


SEF4MOTIFGM7S 


site 
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v S0001 03 
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site 
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site 


81 




TATAAAT 


S000109 
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For more information about the SignalScan Program, please contact Dr Dan S. 
Prestridge, Tele: (612) 625-3744, Advanced Biosciences Computing Center,
E-mail: danp@biosci.umn.edu, 1479 Gortner Ave., University of Minnesota,
St. Paul, MN 55108

The TFD data is at the gopher site, gopher://genome-gopher.stanford.edu. 
For more information about the WebSignalScan service, please contact Meena 
Sakharkar, meena@biomed.nus.sg, BioInformatics Centre, NUS.



Database Searched: PlantCARE 
URL: http://sphinx.rug.ac.be:8080/PlantCARE/

Sequence submitted: 

>G564 promoter (-921 to -662) 11/21/00 

+ GAAAAGTGAA GAAAACCATG TAATGAAAAC AAAATGGCAC GACAATCAAA AAAAGTTTTC ACGCAAAATT 
+ TTCTTCAAAA TTTATAACAT TTTCATGTTG TGTTTGTTTC AAAGCCTAGA AAAACGAAGA GTTACTATTG 
+ GTAATGAAAA GCGAAGAAAA CCACATAATA AAAACAAAAT GGCACGACAA TCAAGAAAAA GTTTTCACAC 
+ AAAACTTTTT TCAAAATTTA CTATGTTTAT TTCGAAATTT AGAAAAACGA AGAGTTATTA TTAGTAATGA
+ AAAGCGAAGA AAACTACGTA ATAAAAAACA AAATGGCACG ACAATAAAAA AAGTTTTCAC GCAAAATTTT 
+ CTTGGTGCGC AGAAAGTTAT ATATATTAAT TAATTAATTT TCATTTACTT TTTTCCCTTT TTATTTTAAA 
+ GTTAAATTAT TATTATTTTC ATTTAAAA 

- CTTTTCACTT CTTTTGGTAC ATTACTTTTG TTTTACCGTG CTGTTAGTTT TTTTCAAAAG TGCGTTTTAA
- AAGAAGTTTT AAATATTGTA AAAGTACAAC ACAAACAAAG TTTCGGATCT TTTTGCTTCT CAATGATAAC
- CATTACTTTT CGCTTCTTTT GGTGTATTAT TTTTGTTTTA CCGTGCTGTT AGTTCTTTTT CAAAAGTGTG
- TTTTGAAAAA AGTTTTAAAT GATACAAATA AAGCTTTAAA TCTTTTTGCT TCTCAATAAT AATCATTACT
- TTTCGCTTCT TTTGATGCAT TATTTTTTGT TTTACCGTGC TGTTATTTTT TTCAAAAGTG CGTTTTAAAA
- GAACCACGCG TCTTTCAATA TATATAATTA ATTAATTAAA AGTAAATGAA AAAAGGGAAA AATAAAATTT
- CAATTTAATA ATAATAAAAG TAAATTTT
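
The "-" lines above are the base-by-base complement of the "+" lines, printed in the same left-to-right order. The following is a small sketch of that relationship, with hypothetical helper names `complement` and `reverse_complement`; it assumes plain A/C/G/T input and leaves any other character, including the spaces between blocks, unchanged.

```python
# Translation table for Watson-Crick pairing; case is preserved.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def complement(seq: str) -> str:
    """Complement each base in place (the layout used for the '-' lines)."""
    return seq.translate(COMPLEMENT)


def reverse_complement(seq: str) -> str:
    """Complement and reverse, giving the minus strand read 5' to 3'."""
    return complement(seq)[::-1]


# First two blocks of the '+' strand as printed.
plus_start = "GAAAAGTGAA GAAAACCATG"
print(complement(plus_start))
print(reverse_complement("GAAAAGTGAA"))
```

Motif hits reported on the "-" strand are matched against the reverse complement, which is why their printed sequences differ from the "+" strand text at the same position.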
3-AF1_binding_sit

Site Name          Organism  Position  Strand  Core simil.  Matrix simil.  sequence
3-AF1_binding_sit  ST        260       +       1.000        0.860          AAGAgttatt

Function: 

AAGAA-motif 

Site Name    Organism      Position  Strand  Core simil.  Matrix simil.  sequence
AAGAA-motif  Avena sativa  6         +       1.000        0.903          gtgAAGAa
AAGAA-motif  Avena sativa  151       +       1.000        0.870          gcgAAGAa
AAGAA-motif  Avena sativa  284       +       1.000        0.870          gcgAAGAa
Function:

ABRE 

Site Name  Organism         Position  Strand  Core simil.  Matrix simil.  sequence
ABRE       Hordeum vulgare  293       +       [illegible]  [illegible]    actACGTaat

Function: cis-acting element involved in the abscisic acid responsiveness

ACE 



Site Name  Organism              Position  Strand  Core simil.  Matrix simil.  sequence
ACE        Petroselinum crispum  293               1.000        0.908          actACGTaat

Function: cis-acting element involved in light responsiveness



AE-box
Site Name  Organism              Position  Strand  Core simil.  Matrix simil.  sequence
AE-box     Arabidopsis thaliana  67                1.000        0.852          AGAAaatt
AE-box     Arabidopsis thaliana  345               1.000        0.852          AGAAaatt
AE-box     Arabidopsis thaliana  361       +       1.000        0.852          AGAAagtt

Function: part of a module for light response

AT1-motif
Site Name  Organism           Position  Strand  Core simil.  Matrix simil.  sequence
AT1-motif  Solanum tuberosum  409       +       1.000        0.859          ttttATTTtaaa

Function: part of a light responsive module

Box_4



Site Name  Organism  Position  Strand  Core simil.  Matrix simil.  sequence
Box_4      PC        375       +       1.000        1.000          ATTAat
Box_4      PC        379       +       1.000        1.000          ATTAat
Box_4      PC        383               1.000        1.000          ATTAat
Function:
















Box_I 
















Site Name  Organism  Position  Strand  Core simil.  Matrix simil.  sequence
Box_I      PS        107       +       1.000        1.000          TTTCaaa
Box_I      PS        203       +       1.000        0.857          TTTCaca
Box_I      PS        219       +       1.000        1.000          TTTCaaa
Box_I      PS        240       +       1.000        0.857          TTTCgaa
Box_I      PS        241               1.000        0.857          TTTCgaa
Box_I      PS        249               1.000        0.857          TTTCtaa
Function:
















Box_II 
















Site Name  Organism  Position  Strand  Core simil.  Matrix simil.  sequence
Box_II     ST        139               1.000        0.889          TGGTaatga
Box_II     AT        161               1.000        0.954          CCACataat
Function:



CAAT-box
Site Name  Organism              Position  Strand  Core simil.  Matrix simil.  sequence
CAAT-box   Hordeum vulgare       43        +       1.000        1.000          CAAT
CAAT-box   Arabidopsis thaliana  137               1.000        1.000          aCCAAt
CAAT-box   Hordeum vulgare       188               1.000        1.000          CAAT
CAAT-box   Hordeum vulgare       322               1.000        1.000          CAAT
CAAT-box   Arabidopsis thaliana  351               1.000        0.857          aCCAAg

Function: common cis-acting element in promoter and enhancer regions

ERE



Site Name        Organism               Position  Strand  Core simil.  Matrix simil  sequence
ERE              Dianthus caryophyllus  239               1.000        0.875         ATTTcgaa
ERE              Dianthus caryophyllus  241               1.000        0.875         ATTTcgaa
ERE              Dianthus caryophyllus  413               1.000        0.875         ATTTtaaa
ERE              Dianthus caryophyllus  441               1.000        0.875         ATTTaaaa
ERE              Dianthus caryophyllus  442       -       1.000        0.875         ATTTtaaa

Function: ethylene-responsive element

G-box 



Site Name        Organism               Position  Strand  Core simil.  Matrix simil  sequence
G-box            Zea mays               17        +       0.842        0.870         CATGta
G-box            Zea mays               38                1.000        0.903         CACGac
G-box            Zea mays               94        +       0.842        0.886         CATGtt
G-box            Zea mays               183       +       1.000        0.903         CACGac
G-box            Zea mays               317       +       1.000        0.903         CACGac

Function: cis-acting regulatory element involved in light responsiveness



GC-repeat 



Site Name        Organism               Position  Strand  Core simil.  Matrix simil  sequence
GC-repeat        Oryza sativa           351       -       1.000        1.000         gCACCaag

Function: ? 
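
The rows in these tables share one column layout (Site Name, Organism, Position, Strand, Core simil., Matrix simil, sequence), so a row can be split into fields by reading it from the right. The following is only a minimal sketch: `MotifHit`, `parse_hit` and `core_of` are hypothetical names introduced for illustration, and the reading of the capital letters in the sequence column as the core of the match is an assumption, not something stated in the tables.

```python
from typing import NamedTuple, Optional


class MotifHit(NamedTuple):
    """One table row (hypothetical container, not part of the source)."""
    site_name: str
    organism: str
    position: int
    strand: Optional[str]  # "+" or "-", or None when the strand cell is empty
    core_simil: float
    matrix_simil: float
    sequence: str


def parse_hit(line: str) -> MotifHit:
    """Split a whitespace-separated row, reading the fixed fields from the right."""
    tokens = line.split()
    sequence = tokens.pop()
    matrix_simil = float(tokens.pop())
    core_simil = float(tokens.pop())
    # The strand cell is missing in many rows, so it is treated as optional.
    strand = tokens.pop() if tokens[-1] in ("+", "-") else None
    position = int(tokens.pop())
    # First token is the site name; whatever remains is the organism.
    site_name, organism = tokens[0], " ".join(tokens[1:])
    return MotifHit(site_name, organism, position, strand,
                    core_simil, matrix_simil, sequence)


def core_of(sequence: str) -> str:
    """Return the upper-case letters (assumed here to mark the core match)."""
    return "".join(ch for ch in sequence if ch.isupper())


hit = parse_hit("GC-repeat Oryza sativa 351 - 1.000 1.000 gCACCaag")
print(hit.organism, hit.position, hit.strand, core_of(hit.sequence))
```

Reading from the right keeps multi-word organism names such as "Oryza sativa" intact, and rows whose strand cell is empty are returned with `strand` set to `None` rather than a guessed value.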



HSE

Site Name        Organism               Position  Strand  Core simil.  Matrix simil  sequence
HSE              Brassica oleracea      49        +       0.944        0.878         aAAAAagtt
HSE              Brassica oleracea      50        +       0.944        0.912         aAAAAgttt
HSE              Brassica oleracea      52                0.944        0.874         gAAAActtt
HSE              Brassica oleracea      66                1.000        0.978         aGAAAattt
HSE              Brassica oleracea      77                0.833        0.868         aTAAAtttt
HSE              Brassica oleracea      87                1.000        0.853         tGAAAatg L
HSE              Brassica oleracea      196               0.944        0.912         aAAAAgttt
HSE              Brassica oleracea      198               0.944        0.874         gAAAActtt
HSE              Brassica oleracea      210               0.944        0.874         cAAAActtt
HSE              Brassica oleracea      212               0.944        0.912         aAAAAgttt
HSE              Brassica oleracea      213               0.944        0.878         aAAAAagtt
HSE              Brassica oleracea      327               0.944        0.878         aAAAAagtt
HSE              Brassica oleracea      328               0.944        0.912         aAAAAgttt
HSE              Brassica oleracea      330               0.944        0.874         gAAAActtt
HSE              Brassica oleracea      344               1.000        0.978         aGAAAattt
HSE              Brassica oleracea      361               1.000        0.888         aGAAAgtta
HSE              Brassica oleracea      335               1.000        0.853         tGAAAatta

Function: cis-acting element involved in heat stress responsiveness

I-box



Site Name        Organism               Position  Strand  Core simil.  Matrix simil  sequence
I-box            Pisum sativum          93                0.857        0.883         aACATga
I-box            Pisum sativum          162               0.857        0.883         cACATaa
I-box            Solanum tuberosum      163               1.000        1.000         tATTAtgt
I-box            Pisum sativum          237               0.857        0.941         gAAATaa
I-box            Pisum sativum          367               1.000        1.000         tATATaa
I-box            Pisum sativum          372       +       1.000        0.941         tATATta
I-box            Pisum sativum          391               0.857        0.941         tAAATga
I-box            Pisum sativum          411               0.857        0.883         aAAATaa
I-box            Pisum sativum          423               0.857        0.883         tAAATta
I-box            Solanum tuberosum      424               1.000        0.903         aATAAttt
I-box            Arabidopsis thaliana   426               1.000        0.863         aATAAtaat
I-box            Arabidopsis thaliana   429               1.000        0.863         aATAAtaat
I-box            Solanum tuberosum      431               1.000        0.951         tATTAttt
I-box            Pisum sativum          433               0.857        0.883         aAAATaa
I-box            Pisum sativum          439               0.857        0.941         tAAATga

Function: part of a light responsive element

P-box

sequence: CCTTttt

Function: gibberellin-responsive element

Prolamin_box 

Prolamin_box     Oryza sativa           278       +       1.000        [illegible]   tgaAAAGc

Function: cis-acting regulatory element associated with GCN4

TATA-box

Site Name        Organism               Position  Strand  Core simil.  Matrix simil  sequence
TATA-box         Daucus carota          79                1.000        1.000         TATAaatt
TATA-box         Brassica juncea        80                1.000        1.000         TATAaat
TATA-box         Helianthus annuus      81                1.000        1.000         TATAaa
TATA-box         Brassica oleracea      82                1.000        0.908         tTATAac
TATA-box         Brassica napus         83                1.000        0.892         gtTATA
TATA-box         Oryza sativa           117               0.818        0.912         TAGAaaa
TATA-box         Oryza sativa           169               0.818        0.872         TAAAaac
TATA-box         Zea mays               248               0.909        0.879         TTTAgaaa
TATA-box         Oryza sativa           250               0.818        0.912         TAGAaaa
TATA-box         Oryza sativa           302               0.818        0.912         TAAAaaa
TATA-box         Oryza sativa           325               0.818        0.912         TAAAaaa
TATA-box         Daucus carota          364               1.000        0.863         TATAactt
TATA-box         Brassica juncea        365               1.000        0.857         TATAact
TATA-box         Zea mays               366               1.000        0.879         TATAtaac
TATA-box         Oryza sativa           367       -       1.000        0.956         TATAtaa
TATA-box         Oryza sativa           368       +       1.000        0.929         TATAtat
TATA-box         Oryza sativa           369               1.000        0.929         TATAtat
TATA-box         Solanum tuberosum      370               1.000        1.000         TATAta
TATA-box         Glycine max            372       +       1.000        0.891         TATAtt
TATA-box         Oryza sativa           407               0.818        0.872         TAAAaag
TATA-box         Zea mays               413               0.909        0.879         TTTAaaat
TATA-box         Zea mays               442       +       0.909        0.879         TTTAaaat

Function: core promoter element around -30 of transcription start

TC-rich_repeats

Site Name        Organism               Position  Strand  Core simil.  Matrix simil  sequence
TC-rich_repeats  NT                     7                 1.000        0.952         gTTTTcttca
TC-rich_repeats  NT                     68        +       1.000        1.000         aTTTTcttca
TC-rich_repeats  NT                     152               1.000        0.909         gTTTTcttcg
TC-rich_repeats  NT                     191               1.000        0.885         tTTTTcttga
TC-rich_repeats  NT                     248               1.000        0.914         tTTTTctaaa
TC-rich_repeats  NT                     285               1.000        0.909         gTTTTcttcg
TC-rich_repeats  NT                     346       +       1.000        0.915         aTTTTcttgg

Function:

WUN-motif

Site Name        Organism               Position  Strand  Core simil.  Matrix simil  sequence
WUN-motif        Brassica oleracea      18                1.000        0.948         tCATTacat
WUN-motif        Brassica oleracea      139               1.000        1.000         tCATTacca
WUN-motif        Brassica oleracea      237               0.857        0.948         tTATTtcga
WUN-motif        Brassica oleracea      242               1.000        1.000         aAATTtcga
WUN-motif        Brassica oleracea      272               1.000        0.948         tCATTacta
WUN-motif        Brassica oleracea      296               0.857        0.948         tTATTacgt

Function: wound-responsive element



